Abstract

We  consider  the  sequential  estimation  of  the  largest  mean  of  k populations 
when  the  observations  are  normally  distributed  with  a common  unknown  variance 
and  the  goal  is  to  control  the  mean  square  error  (MSE)  at  a prespecified  level; 
this  is  a generalization  of  problems  considered  by  Blumenthal  (1976)  and  Carroll 
(1977).  By  eliminating  from  the  experiment  populations  which  the  data  indicate 
are  not  associated  with  the  largest  mean,  it  is  shown  that,  compared  to  existing 
procedures,  significant  savings  in  sample  size  can  be  obtained.  Weak  convergence 
results  are  obtained  for  the  stopping  times  and  the  estimate  of  the  largest  mean 
as consequences of more general results; these are used to compute the asymptotic
MSE.
1. Introduction

Let $\theta_1,\ldots,\theta_k$ be the unknown means of k normal populations with common unknown variance $\sigma^2$, and let $\bar X_{1n},\ldots,\bar X_{kn}$ be the sample means for n observations taken from the k populations. Define the ordered population and sample means by $\theta_{[1]} \le \cdots \le \theta_{[k]}$ and $\bar X_{[1]n} \le \cdots \le \bar X_{[k]n}$. Blumenthal (1973, 1976) constructed sequential procedures for estimating the largest mean $\theta_{[k]}$ with a prespecified bound r on the mean square error (MSE). His procedures were mildly data sensitive in that they depend on estimates $\Delta_{in} = \bar X_{[k]n} - \bar X_{[i]n}$ of $\Delta_i = \theta_{[k]} - \theta_{[i]}$, and he obtained some partial asymptotic results. Carroll (1978a) extended the asymptotic results when $\sigma$ is known.

The purpose of this paper is twofold. In Section 2 we consider weak convergence results which greatly extend the asymptotic theory for Blumenthal's stopping time $N_B$ and generalize the results of Carroll (1978a). We define and study the weak convergence of a stopping time process $\{N_r\}$ that includes $N_B$ as a special finite-dimensional case. We then define a new random change of time process for sample means that is based on $\{N_r\}$ and consider its weak convergence. We finally study the limit distribution and MSE for the maximum sample mean upon stopping.

The second purpose of this paper is based upon our belief that $N_B$ can be made more data sensitive and efficient by the simple expedient of grafting onto it the ability to eliminate populations (early in the experiment) which are obviously not associated with, and hence give no information about, $\theta_{[k]}$. In Sections 3 and 4 we define the elimination procedure, study its asymptotic behavior and give some Monte-Carlo results which show that the savings (over $N_B$) in sample size can be considerable.
To fix notation, define $H_1 \equiv 1$ and let $(\sigma^2/n)H_k(n^{1/2}\Delta_1/\sigma,\ldots,n^{1/2}\Delta_{k-1}/\sigma)$ be the MSE due to estimating $\theta_{[k]}$ by $\theta^*_n = \bar X_{[k]n}$. In order to control MSE at a level r, when $\sigma^2$ is known the following procedures have been proposed: take $N^*$ observations from each population, where either (Blumenthal (1973))

(1.1)  $N^* = N^*(k) = \inf\{n : nr \ge \sigma^2 H_k(n^{1/2}\Delta_{1n}/\sigma,\ldots,n^{1/2}\Delta_{k-1,n}/\sigma)\}$,

or (Blumenthal (1976))

(1.2)  $N^* = N^*(k) = \inf\{m : n(m) \le m\}$, where
       $n(m) = \inf\{t : rt \ge \sigma^2 H_k(t^{1/2}\Delta_{1m}/\sigma,\ldots,t^{1/2}\Delta_{k-1,m}/\sigma)\}$.

Although an analogue of (1.2) (for the case $\sigma$ unknown) is possible and requires the same basic techniques as used here, for notational reasons we prefer to consider (1.1) and take $N_B$ observations from each population, where

(1.3)  $N_B = N_B(k) = \inf\{n \ge ar^{-1} : nr \ge \sigma_{nk}^2 H_k(n^{1/2}\Delta_{1n}/\sigma_{nk},\ldots,n^{1/2}\Delta_{k-1,n}/\sigma_{nk})\}$.

Here a = a(r) > 0 are a set of small bounded constants with finite positive limit $a_0$ and with $ar^{-1}$ an integer, while $\sigma_{nk}^2$ is the usual pooled sample variance with k(n-1) degrees of freedom.
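To make the mechanics of a rule shaped like (1.3) concrete for k = 2, the following Python sketch draws one observation per population at a time and stops the first time n r is at least the pooled variance times H evaluated at the standardized gap. The function name `blumenthal_stop`, the default a = 0.2, the seed, the cap `max_n`, and the stand-in `H` (constant 1 by default) are all assumptions made for this illustration; they are not taken from the paper.

```python
import math
import random

def blumenthal_stop(theta1, theta2, r, a=0.2, H=lambda x: 1.0, seed=0, max_n=10**6):
    """One run of a rule shaped like (1.3) for k = 2, unit-variance normal data.

    H stands in for H_2; a, seed and max_n are illustrative choices.
    Returns (stopping time n, larger sample mean at stopping).
    """
    rng = random.Random(seed)
    n0 = max(2, math.ceil(a / r - 1e-9))   # sampling may not stop before a / r
    s1 = s2 = q1 = q2 = 0.0                # running sums and sums of squares
    for n in range(1, max_n + 1):
        x1, x2 = rng.gauss(theta1, 1.0), rng.gauss(theta2, 1.0)
        s1 += x1
        q1 += x1 * x1
        s2 += x2
        q2 += x2 * x2
        if n < n0:
            continue
        m1, m2 = s1 / n, s2 / n
        # pooled variance with k(n - 1) = 2(n - 1) degrees of freedom
        var = ((q1 - n * m1 * m1) + (q2 - n * m2 * m2)) / (2 * (n - 1))
        delta_n = abs(m2 - m1)             # estimated gap between the two means
        if n * r >= var * H(math.sqrt(n) * delta_n / math.sqrt(var)):
            return n, max(m1, m2)
    raise RuntimeError("no stop before max_n")

n_stop, estimate = blumenthal_stop(0.0, 1.0, r=0.01)
```

With the same seed, replacing `H` by a smaller function can only make the rule stop earlier, since the stopping inequality becomes easier to satisfy along the same data path.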

We make the following

Assumptions. The i.i.d. observations from the jth population have finite fourth moment. The functions $H_k$ are continuous and satisfy $0 < H_{\min} \le H_k(x_1,\ldots,x_{k-1}) \le H_{\max} < \infty$. Further, for every k and p,

$\lim_{x_1,\ldots,x_p \to \infty} H_k(x_1,\ldots,x_{k-1}) = H_{k-p}(x_{p+1},\ldots,x_{k-1})$.

Finally, for every k and u, the Lebesgue measure of the set

$\{(x_1,\ldots,x_{k-1}) : H_k(x_1,\ldots,x_{k-1}) = u\}$

is zero.
If the observations are normally distributed, the assumptions hold for the MSE function $H_k$. We take $0 < ka_0 < H_{\min} < H_{\max}$, and it will simplify computations without affecting results if we take $H_{\max} = 1$. With k=2, define $\Delta_n = \bar X_{[2]n} - \bar X_{[1]n}$, $\Delta = \theta_{[2]} - \theta_{[1]}$ and without loss of generality take $\theta_1 < \theta_2$ and $\sigma^2 = 1$.
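For k = 2 and normal data with unit variance, the normalized MSE of the larger sample mean can be written as $E[\max(Z_1 - x, Z_2)^2]$ with $Z_1, Z_2$ independent standard normal and $x = n^{1/2}\Delta$. The sketch below evaluates this with the standard second-moment formula for the maximum of two independent normal variables (often attributed to Clark (1961)); using this closed form for $H_2$ is our assumption for illustration and is not stated in the text, and the names `h2`, `norm_pdf` and `norm_cdf` are hypothetical.

```python
import math

def norm_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def norm_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def h2(x):
    """E[max(Z1 - x, Z2)^2] for independent standard normal Z1, Z2 and x >= 0.

    Assumed stand-in for the MSE function H_2 evaluated at x = sqrt(n) * Delta.
    """
    u = x / math.sqrt(2.0)
    return (x * x + 1.0) * norm_cdf(-u) + norm_cdf(u) - math.sqrt(2.0) * x * norm_pdf(u)

# Evaluate on a grid to see that the function stays in a bounded band below 1.
grid = [h2(0.1 * i) for i in range(0, 101)]
lowest, highest = min(grid), max(grid)
```

Under this reading the function equals 1 at x = 0, dips below 1 for moderate x, and returns to 1 as x grows, which is in line with the normalization $H_{\max} = 1$ and with the limit condition $H_2(x) \to H_1 = 1$.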


2. Weak Convergence Results for $N_B$

In this section we prove a number of weak convergence results for a stopping time process that includes $N_B$ as a special (finite dimensional) case. All results are given for k=2 but are easy to generalize to $k \ge 3$. To outline important special cases of the results, in Lemma 1 we give the limit distribution of $rN_B$, in Lemma 2 we discuss a random change of time for sample means, and in Lemma 3 we establish the limit distribution of the maximum sample mean upon stopping. We assume throughout this section that $\Delta \sim r^{\beta}$ for some $0 \le \beta < \infty$ and $\Delta^2/r \to \eta_0^2$ ($0 \le \eta_0 \le \infty$). Let W be Brownian motion with mean zero and variance 2t at time t, and define

$W^*(s,\eta_0) = s^{-1/2}\,|W(s) + s\eta_0|$.

Letting [·] denote the greatest integer function, define a stochastic process

$G_r(s) = [s/r]^{1/2}\,\Delta_{[s/r]}/\sigma_{[s/r]}$.

The proofs of all results are delayed to the end of the section.


Proposition 1. Let $0 < b_0 < b_1 < \infty$ and "$\Rightarrow$" denote weak convergence. On the space $D[b_0,b_1]$ (Billingsley (1968)),

$G_r \Rightarrow W^*(\cdot,\eta_0)$  ($\beta \ge 1/2$),

$G_r \xrightarrow{P} \infty$  ($\beta < 1/2$).
In studying the stopping time $N_B$, we have found a more general approach to be as convenient and to yield much stronger results. Consider processes for $0 \le t \le 1$ given by

$N_r(t) = \inf\{m \ge ar^{-1} : mr(t+1) \ge \sigma_m^2 H_2(m^{1/2}\Delta_m/\sigma_m)\}$,

$Q(t) = \inf\{s \ge a : s(t+1) \ge H_2(W^*(s,\eta_0))\}$.

Note that $N_r(0) = N_B$. Both $N_r$ and Q are monotone non-increasing in t and are easily verified to be members of D[0,1]. Lemma 1 is comparable in spirit to work of Gut (1975).
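The limiting stopping time Q(t) can be approximated by discretizing the Brownian motion W (variance 2t at time t) on a fine grid and recording the first grid point s >= a at which s(t+1) reaches $H_2(W^*(s,\eta_0))$. The sketch below is only a rough illustration under stated assumptions: the closed form used for `h2` is the two-normal maximum formula (not given in the text), the step size `dt` and the default a = 0.2 are arbitrary, and the names `sample_q` and `h2` are hypothetical.

```python
import math
import random

def h2(x):
    """Assumed closed form for H_2: E[max(Z1 - x, Z2)^2], Z1, Z2 iid N(0, 1)."""
    u = x / math.sqrt(2.0)
    cdf = lambda z: 0.5 * math.erfc(-z / math.sqrt(2.0))
    pdf = math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)
    return (x * x + 1.0) * cdf(-u) + cdf(u) - math.sqrt(2.0) * x * pdf

def sample_q(t, eta0, a=0.2, dt=1e-3, rng=random):
    """One approximate draw of Q(t) = inf{s >= a : s(t+1) >= H_2(W*(s, eta0))}."""
    s, w = 0.0, 0.0
    while True:
        s += dt
        w += rng.gauss(0.0, math.sqrt(2.0 * dt))   # increment with variance 2*dt
        if s >= a:
            w_star = abs(w + s * eta0) / math.sqrt(s)
            if s * (t + 1.0) >= h2(w_star):
                return s

random.seed(1)
draws_t0 = [sample_q(0.0, eta0=1.0) for _ in range(50)]
draws_t1 = [sample_q(1.0, eta0=1.0) for _ in range(50)]
```

Because the assumed $H_2$ is bounded above by 1, the loop always terminates by the time s(t+1) passes 1; the empirical survival function of `draws_t0` gives a crude picture of Pr{Q(0) > u}.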

Lemma 1. (Weak convergence of the stopping time process). For $\beta < 1/2$, $rN_B = rN_r(0) \xrightarrow{P} 1$. For $\beta \ge 1/2$,

(2.1)  $rN_r \Rightarrow Q$ on D[0,1].

Define $G_t^*$ by

(2.2)  $G_t^*(u) = \Pr\{Q(t) > u\} = \Pr\{H_2(W^*(s,\eta_0)) - s(t+1) > 0 \text{ for all } a \le s \le u\}$.

Since $2a < H_{\min}$ and $H_2$ is bounded, it is easy to show that $1 - G_t^*$ is a proper distribution function. Of more interest is the following result.

Corollary 1. (Distribution of Blumenthal's stopping time). In Lemma 1 for $\beta \ge 1/2$,

$\Pr\{rN_B > u\} \to G_0^*(u) = \Pr\{Q(0) > u\}$.

The next result will be useful in discussing the limit distribution of the larger sample mean when sampling is stopped, but it is quite general and may be of some interest in its own right for the following reasons. Typical results in the theory of weak convergence with random indices (Durrett and Resnick (1976) have a nice review) start with processes $\{V_r\}$ in $D[0,\infty)$ and a sequence of integer-valued random variables $\{M_r\}$ and consider the process $V_r(trM_r)$ on $D[0,\infty)$, where $rM_r$ has a limit distribution. In other words, $V_r$ is perturbed by a "random time change" proportional to $rM_r$. In the next result, we allow the random time change $rM_r$ to be itself a stochastic process. Define m = [t/r] and for j = 1,2 let

$V_r^{(j)}(t) = r^{1/2}\Big\{\sum_{i=1}^{m}(X_{ji} - \theta_j) + (t/r - m)(X_{j,m+1} - \theta_j)\Big\}$.

The processes $V_r^{(j)}$ are elements of $C[0,\infty)$ with weak limits $V^{(j)}$.

Lemma 2. (Weak convergence for random change of time processes). Define processes

$W_r^{(j)}(s,t) = V_r^{(j)}(s\,rN_r(t))$

on $D_2\{[0,\infty) \times [0,1]\}$ (Bickel and Wichura (1971)). Then for $\beta \ge 1/2$,

(2.3)  $(W_r^{(1)}, W_r^{(2)}, rN_r) \Rightarrow (W^{(1)}, W^{(2)}, Q)$,

where $W^{(j)}(s,t) = V^{(j)}(sQ(t))$.

The last Lemma will prove useful when we discuss the specific proposal for eliminating the inferior population early in the experiment. It is a simple corollary of Lemma 2 which delineates the asymptotic behavior of the larger sample mean when the number of observations is approximately $N_B$.

Lemma 3. (Limit distribution of the larger sample mean). Let $M_r^{(j)}$ and $\bar X^{(j)}$ be the number of observations and the sample mean after $M_r^{(j)}$ observations on the jth population, where

$M_r^{(j)}/N_B = M_r^{(j)}/N_r(0) \xrightarrow{P} 1$  (j = 1,2).

Let $\theta^* = \max\{\bar X^{(1)}, \bar X^{(2)}\}$. Then, for $\beta \ge 1/2$,

$r^{-1/2}(\theta^* - \theta_2) \xrightarrow{L} Q(0)^{-1}\max\{V^{(1)}(Q(0)) - \eta_0,\ V^{(2)}(Q(0))\}$.

Proof of Proposition 1: The process $G_r(s)$ can be written as

(2.4)  $G_r(s) = \big|[s/r]^{1/2}(\bar X_{2[s/r]} - \bar X_{1[s/r]} - \theta_2 + \theta_1) + [s/r]^{1/2}\Delta\big| \,/\, \sigma_{[s/r]}$.

The denominator of (2.4) converges almost surely to $\sigma = 1$ while the numerator converges weakly to $W^*(\cdot,\eta_0)$, completing the proof. □

Proof of Lemma 1: We first prove (2.1) by verifying tightness and the convergence of the finite dimensional distributions. Note first that

(2.5)  $\Pr\{rN_r(t_i) > u_i\ (i = 1,\ldots,p)\}$
       $= \Pr\{mr(t_i+1) < \sigma_m^2 H_2(m^{1/2}\Delta_m/\sigma_m) \text{ for all } a \le mr \le u_i\ (i = 1,\ldots,p)\}$
       $= \Pr\{[s/r]r(t_i+1) < \sigma_{[s/r]}^2 H_2(G_r(s)) \text{ for all } a \le s \le u_i\ (i = 1,\ldots,p)\}$,

the last equation using the facts that $ar^{-1}$ is an integer and $2a < H_{\min}$. Rewrite (2.5) as

(2.6)  $\Pr\{rN_r(t_i) > u_i\ (i = 1,\ldots,p)\}$
       $= \Pr\big\{\inf_{a \le s \le u_i}\big(\sigma_{[s/r]}^2 H_2(G_r(s)) - [s/r]r(t_i+1)\big) > 0\ (i = 1,\ldots,p)\big\}$.

From Proposition 1, the continuous mapping theorem (since inf is continuous in this context) and Theorem 2.1 of Billingsley (1968), (2.6) shows that as $r \to 0$,

$\liminf \Pr\{rN_r(t_i) > u_i\ (i = 1,\ldots,p)\} \ge \Pr\{Q(t_i) > u_i\ (i = 1,\ldots,p)\}$,

thus verifying the convergence of the finite dimensional distributions. To prove tightness, we appeal to Theorem 15.2 of Billingsley (1968). The first condition of his Theorem is satisfied in our case because the process $rN_r$ is non-increasing and $rN_r(0)$ has been shown to have a limit distribution. To check the second condition of Billingsley's Theorem 15.2, we must show (in his notation) that for all $\epsilon > 0$,

(2.7)  $\lim_{\delta \to 0}\lim_{r \to 0} \Pr\{w'_{rN_r}(\delta/2) \ge \epsilon\} = 0$.

Now, since $rN_r$ and Q are non-increasing,

(2.8)  $\lim_{r \to 0}\Pr\{w'_{rN_r}(\delta/2) \ge \epsilon\}$
       $\le \lim_{r \to 0}\Pr\{rN_r(i\delta) - rN_r((i+1)\delta) \ge \epsilon \text{ for some } i = 0,1,\ldots,[1/\delta]\}$
       $\le \Pr\{Q(i\delta) - Q((i+1)\delta) \ge \epsilon \text{ for some } i = 0,1,\ldots,[1/\delta]\}$
       $\le \Pr\{w'_Q(4\delta) \ge \epsilon/4\}$,

the next to last inequality following by the weak convergence of the finite dimensional distributions, while the last follows because Q is non-increasing. Then (2.7) follows from (2.8) because Q is an element of D[0,1].

The rest of Lemma 1 ($\beta < 1/2$) follows in a similar but easier fashion.

Proof of Corollary 1: By Lemma 1, we need only show that $G_0^*$ is continuous. Letting I(A) be the indicator of the event A,

$\lim_{\epsilon \downarrow 0} I\{Q(0) > u + \epsilon\} = I\{Q(0) > u\}$,

$\lim_{\epsilon \uparrow 0} I\{Q(0) > u + \epsilon\} = I\{Q(0) > u\} + I\{H_2(W^*(u,\eta_0)) - u = 0\}$.

Now, by assumption, $\Pr\{H_2(W^*(u,\eta_0)) = u\} = 0$, so that $G_0^*$ is continuous. □

In order to prove Lemma 2, we need the following supplementary results. For intervals $T_1$, $T_2$ of the real line, we define $D_2\{T_1 \times T_2\}$ to be the space of functions x(s,t) ($s \in T_1$, $t \in T_2$) which are continuous from above with limits from below (Bickel and Wichura (1971)).
Proposition 2. Let a, b be arbitrary positive numbers and define $D_0[0,a]$ to be the set of functions $\phi$ in D[0,a] which are nonincreasing and satisfy $0 \le \phi(t) \le b$ for $0 \le t \le a$. Let $\{V_n\}$ be elements of $C[0,\infty)$, $\{\Phi_n\}$ be elements of $D_0[0,a]$ and suppose there exists an element $(V,\Phi_0)$ in $C[0,\infty) \times D_0[0,a]$ satisfying

$(V_n, \Phi_n) \Rightarrow (V, \Phi_0)$.

Define random elements $\{V_n^*\}$, $V^*$ of $D_2\{[0,\infty) \times [0,a]\}$ by $V_n^*(s,t) = V_n(s\Phi_n(t))$ and $V^*(s,t) = V(s\Phi_0(t))$. Then on $D_2\{[0,\infty) \times [0,a]\}$,

$V_n^* \Rightarrow V^*$.

Proof: Define $V_n^{(1)}(s,t) = V_n(st)$, $V^{(1)}(s,t) = V(st)$, so that $V_n^{(1)}$, $V^{(1)}$ are elements of $C_2\{[0,\infty) \times [0,a]\}$. Denote by A the space $C_2\{[0,\infty) \times [0,a]\} \times D_0[0,a]$ and define a function $h: A \to D_2\{[0,\infty) \times [0,b]\}$ by

$h(x,v)(s,t) = x(s,v(t))$.

This is shown to be a measurable mapping following Billingsley (1968, page 232). Since $V_n^* = h(V_n^{(1)}, \Phi_n)$ and

$(V_n^{(1)}, \Phi_n) \Rightarrow (V^{(1)}, \Phi_0)$,

the continuous mapping theorem completes the proof once we show that h is continuous at elements of A. Let $(x_n,\phi_n) \to (x,\phi) \in A$. Then there exist functions $\lambda_n$ mapping [0,a] into [0,a] such that for every c > 0,

$\sup\{|x_n(s,t) - x(s,t)| : 0 \le s \le c,\ 0 \le t \le a\} \to 0$,
$\sup\{\max[\,|\phi_n(\lambda_n(t)) - \phi(t)|,\ |\lambda_n(t) - t|\,] : 0 \le t \le a\} \to 0$.

These two facts imply that for every $c_0 > 0$,

$\sup\{|x_n(s,\phi_n(\lambda_n(t))) - x(s,\phi(t))| : 0 \le s \le c_0,\ 0 \le t \le a\} \to 0$.

Since $c_0 > 0$ is arbitrary, following Lindvall (1973) and Whitt (1970), this shows that h is continuous at $(x,\phi)$ and completes the proof. □

Proposition 3. On $C[0,\infty) \times C[0,\infty) \times D[0,1]$, if $\beta \ge 1/2$,

(2.9)  $(V_r^{(1)}, V_r^{(2)}, rN_r) \Rightarrow (V^{(1)}, V^{(2)}, Q)$.

Proof: By Lemma 1, each of the elements of (2.9) is individually and hence jointly tight, so it suffices to prove convergence of the finite dimensional distributions. We show this only for a special case, noting that

(2.10)  $\Pr\{V_r^{(1)}(t_1) > u_1,\ V_r^{(2)}(t_2) > u_2,\ rN_r(t_3) > u_3\}$
        $= \Pr\big\{V_r^{(1)}(t_1) > u_1,\ V_r^{(2)}(t_2) > u_2,\ \inf_{a \le s \le u_3}\big(\sigma_{[s/r]}^2 H_2(G_r(s)) - [s/r]r(t_3+1)\big) > 0\big\}$.

We assume with no loss of generality that $0 \le t_1, t_2, u_3 \le 1$. Since fourth moments are finite, for any u

(2.11)  $\sup\{r^{1/2}|X_{jm} - \theta_j| : 1 \le m \le ur^{-1}\} \xrightarrow{P} 0$.

This shows that the second term in the definition of $V_r^{(j)}$ is negligible, so that (since $G_r$ is a continuous function of the first terms in $V_r^{(j)}$), on $C[0,1] \times C[0,1] \times D[a,1]$,

$(V_r^{(1)}, V_r^{(2)}, G_r) \Rightarrow (V^{(1)}, V^{(2)}, W^*(\cdot,\eta_0))$.

Thus, as $r \to 0$, the continuous mapping theorem and Theorem 2.1 of Billingsley (1968) show that

$\liminf \Pr\{V_r^{(1)}(t_1) > u_1,\ V_r^{(2)}(t_2) > u_2,\ rN_r(t_3) > u_3\}$
$\ge \Pr\{V^{(1)}(t_1) > u_1,\ V^{(2)}(t_2) > u_2,\ Q(t_3) > u_3\}$,

which proves convergence of the finite dimensional distributions and completes the proof. □
Proof of Lemma 2: The boundedness of $H_2$ means that with probability one there exist positive numbers $a_1$, $a_2$ such that

$a_1 < \inf\{Q(t) : 0 \le t \le 1\} \le \sup\{Q(t) : 0 \le t \le 1\} < a_2$.

Define a process $M_r(t)$ by

$rM_r(t) = rN_r(t)\,I\{a_1 \le rN_r(t) \le a_2\} + a_2\,I\{rN_r(t) > a_2\} + a_1\,I\{rN_r(t) < a_1\}$,

and define $Z_r^{(j)}(s,t) = V_r^{(j)}(s\,rM_r(t))$. By an extension of Proposition 2 and by Proposition 3,

$(Z_r^{(1)}, Z_r^{(2)}, rN_r) \Rightarrow (W^{(1)}, W^{(2)}, Q)$.

Now, since $N_r$ is non-increasing,

$\Pr\{M_r(t) \ne N_r(t)\} \le \Pr\{rN_r(0) > a_2\} + \Pr\{rN_r(1) < a_1\} \to 0$,

so that $Z_r^{(j)} - W_r^{(j)} \xrightarrow{P} 0$. An application of the continuous mapping theorem and Theorem 4.4 of Billingsley completes the proof. □

Proof of Lemma 3. Calculations and (2.11) show that

(2.12)  $r^{-1/2}(\theta^* - \theta_2)$
        $= r^{1/2}\max\Big\{(rM_r^{(1)})^{-1}\sum_{i=1}^{M_r^{(1)}}(X_{1i} - \theta_1) + r^{-1}(\theta_1 - \theta_2),\ (rM_r^{(2)})^{-1}\sum_{i=1}^{M_r^{(2)}}(X_{2i} - \theta_2)\Big\}$
        $= \max\big\{(rM_r^{(1)})^{-1}V_r^{(1)}(rM_r^{(1)}) + r^{-1/2}(\theta_1 - \theta_2),\ (rM_r^{(2)})^{-1}V_r^{(2)}(rM_r^{(2)})\big\} + o_p(1)$
        $= \max\Big\{\Big(\tfrac{M_r^{(1)}}{N_B}rN_B\Big)^{-1}V_r^{(1)}\Big(\tfrac{M_r^{(1)}}{N_B}rN_B\Big) - r^{-1/2}\Delta,\ \Big(\tfrac{M_r^{(2)}}{N_B}rN_B\Big)^{-1}V_r^{(2)}\Big(\tfrac{M_r^{(2)}}{N_B}rN_B\Big)\Big\} + o_p(1)$.

By Lemma 2, the processes $V_r^{(j)}(s\,rN_B)$ are elements of C[0,1] which are tight with weak limits, so that since $M_r^{(j)}/N_B \xrightarrow{P} 1$,

$V_r^{(j)}\Big(\tfrac{M_r^{(j)}}{N_B}rN_B\Big) - V_r^{(j)}(rN_B) \xrightarrow{P} 0$.

This means by Lemma 2 that the elements of the last equation in (2.12) are jointly weakly convergent, so that an application of the continuous mapping theorem completes the proof. □


3.  Eliminating  Populations 

The difficulty with the stopping time $N_B$ is that it is only mildly data sensitive in that it estimates $\Delta_1,\ldots,\Delta_{k-1}$ but continues to sample from populations which the data indicate are not associated with the largest population mean, i.e., it fails to eliminate inferior populations. A basic method for correcting this deficiency is to use the technology due to Robbins (1970) and Swanepoel and Geertsema (1976). Suppose then an initial sample of size m is taken. Define $t(\alpha) = m^{-1}(1 + b^2/(m-1))^{m}$, and let $b = b(\alpha)$ satisfy the equation $1 - F_{m-1}(b) + bf_{m-1}(b) = \alpha/(k-1)$, where $F_{m-1}$ ($f_{m-1}$) is the distribution (density) function of a t-distribution with m-1 degrees of freedom. Define $h(t(\alpha),n) = \{(t(\alpha)n)^{1/n} - 1\}^{1/2}$ and let $S^2(i,j,n) = (n-1)^{-1}\sum_{p=1}^{n}(X_{ip} - X_{jp} - \bar X_{in} + \bar X_{jn})^2$. We say that the ith population is eliminated at stage $M_i$ if it has not been eliminated at any stage $n < M_i$ and if, when populations $j_1,\ldots,j_p$ also have not been eliminated before stage $M_i$, we have for some $j \in \{j_1,\ldots,j_p\}$ that

(3.1)  $\bar X_{jM_i} - \bar X_{iM_i} > h(t(\alpha),M_i)\,S(i,j,M_i)$.

Assuming $\alpha > 0$, $\theta_j = \theta_{[k]}$ and an initial sample size m, the previously cited works show that

(3.2)  $\Pr\{M_j > M_i \text{ for all } i \ne j\} \ge 1 - \alpha$.

In other words, the probability is at most $\alpha$ of eliminating the population with the largest mean. We believe the choice m=5 initial observations will work quite well. The stopping times we consider are then defined formally as follows: choose $\alpha$ (see below) and take an initial sample of size $\max(5, ar^{-1})$ from each population.
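The elimination boundary can be computed numerically. The sketch below assumes the reading $t(\alpha) = m^{-1}(1 + b^2/(m-1))^m$ and $h(t,n) = \{(tn)^{1/n} - 1\}^{1/2}$, solves the equation for b by bisection, and evaluates the t-distribution function by Simpson's rule so that only the standard library is needed. All function names (`t_pdf`, `t_cdf`, `tail_equation`, `solve_b`, `t_alpha`, `boundary`) and the numerical tolerances are our own choices for illustration.

```python
import math

def t_pdf(x, nu):
    """Density of Student's t with nu degrees of freedom."""
    c = math.gamma((nu + 1) / 2) / (math.sqrt(nu * math.pi) * math.gamma(nu / 2))
    return c * (1 + x * x / nu) ** (-(nu + 1) / 2)

def t_cdf(x, nu, steps=2000):
    """Distribution function at x >= 0 via Simpson's rule on [0, x]."""
    if x == 0:
        return 0.5
    h = x / steps
    total = t_pdf(0.0, nu) + t_pdf(x, nu)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, nu)
    return 0.5 + total * h / 3

def tail_equation(b, m):
    """Left side 1 - F_{m-1}(b) + b f_{m-1}(b); decreasing in b for b > 0."""
    return 1 - t_cdf(b, m - 1) + b * t_pdf(b, m - 1)

def solve_b(alpha, k, m, iters=60):
    """Bisection for b(alpha); requires alpha / (k - 1) < 1/2."""
    target = alpha / (k - 1)
    lo, hi = 0.0, 1.0
    while tail_equation(hi, m) > target:   # expand until the root is bracketed
        hi *= 2
    for _ in range(iters):
        mid = (lo + hi) / 2
        if tail_equation(mid, m) > target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def t_alpha(b, m):
    """t(alpha) = (1 + b^2/(m-1))^m / m under the assumed reading."""
    return (1 + b * b / (m - 1)) ** m / m

def boundary(t, n):
    """h(t, n) = sqrt((t n)^(1/n) - 1), the multiplier of S(i, j, n) in (3.1)."""
    return math.sqrt((t * n) ** (1.0 / n) - 1)

b = solve_b(alpha=0.01, k=2, m=5)
t = t_alpha(b, m=5)
```

A useful check on this reading is that at the initial stage n = m the boundary reduces to $b/(m-1)^{1/2}$, the critical value of an ordinary t-test, and that it shrinks as n grows.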

Definition. Reorder the populations so that $M_1 \le M_2 \le \cdots \le M_k$, the ordering in case of ties being by sample means. If $N_B(k) \le M_1$, take $N_B(k)$ observations from each population. Otherwise, completely eliminate the first population from further study and continue as if there were k-1 populations in the experiment (this includes changing the values of $H_k$ to $H_{k-1}$ and $\sigma_{nk}$ to $\sigma_{n,k-1}$ but the value of $t(\alpha)$ in (3.1) remains unchanged). Then, if $N_B(k-1) \le M_2$, take $N_B(k-1)$ observations from each population; otherwise eliminate the second population. Continue in this manner until stopping, denoting the number of observations on each population by $(N_1 \le N_2 \le \cdots \le N_k) = N$, with total sample size $T_E = N_1 + \cdots + N_k$.

Note that Blumenthal's $N_B = N_B(k)$ is obtained as a special case by choosing $\alpha = 0$. We again consider only the case k=2 and define $M = \min(M_1, M_2)$. Recall that $\Delta \sim r^{\beta}$. The next result shows how letting $\alpha \to 0$ as $r \to 0$ influences M. The proof is at the end of the section.

Lemma 4. (Size of M). Choose $\alpha \to 0$ as $r \to 0$ so that $b^2 = 2\log t(\alpha) \sim r^{2\beta_0 - 1}$ ($0 < \beta_0 < 1/2$). Then, as $r \to 0$,

$rM \xrightarrow{P} a_0$ if $\beta < \beta_0$,
$rM \xrightarrow{P} 1$ if $\beta = \beta_0$,
$rM \xrightarrow{P} \infty$ if $\beta > \beta_0$.

Lemma 4 is rather confusing at first sight. Note that $\Delta = |\theta_2 - \theta_1| \sim r^{\beta}$, so the smaller the value of $\beta$ the farther apart the means are and the quicker one should eliminate. This means that for smaller $\beta$, rM should be small, as Lemma 4 shows. The constant $\beta_0$ (a monotone increasing function of $\alpha$) merely serves as a cut off point; for small $\alpha$ and hence small $\beta_0$, it becomes harder to eliminate because we are insisting on more protection (see (3.2)).


Lemma 5. (Comparison of sample sizes). For general k, let $T_B = kN_B$ be the total sample size of the Blumenthal procedure. Let $\Delta_i \sim r^{\beta_i}$ (i = 1,...,k-1) and let p be the number of $\beta_i < \beta_0$, i.e., p is the number of populations whose means are far from $\theta_{[k]}$ relative to $\alpha$. Then, if $T_E$ is the total sample size taken by the elimination procedure,

$T_E/T_B \xrightarrow{P} 1 - (1 - a_0)p/k$,

where $a_0$ is defined immediately following (1.3).


Lemma  5 shows  that  considerable  savings  in  sample  size  are  possible.  In 
the  next  section  we  show  that  this  is  accomplished  without  a corresponding 
increase  in  MSE. 
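As a small worked illustration of the limit in Lemma 5, the helper below evaluates $1 - (1 - a_0)p/k$. The function name `asymptotic_ratio` and the sample values of k, p and $a_0$ are hypothetical and chosen only to show the arithmetic.

```python
def asymptotic_ratio(k, p, a0):
    """Limiting value of T_E / T_B from Lemma 5: 1 - (1 - a0) * p / k.

    k  : number of populations
    p  : number of populations eliminated early (0 <= p <= k - 1)
    a0 : limit of the constants a(r), with 0 < a0 < 1
    """
    if not 0 <= p <= k - 1:
        raise ValueError("p must lie between 0 and k - 1")
    if not 0 < a0 < 1:
        raise ValueError("a0 must lie strictly between 0 and 1")
    return 1.0 - (1.0 - a0) * p / k

# Illustrative values only: two populations, one eliminated, a0 = 0.2.
ratio_two = asymptotic_ratio(k=2, p=1, a0=0.2)
# Five populations with three eliminated early.
ratio_five = asymptotic_ratio(k=5, p=3, a0=0.2)
```

The ratio is 1 when no population is eliminated (p = 0) and decreases linearly in p, so the largest possible saving, attained at p = k-1, is $(1 - a_0)(k-1)/k$.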

Proof of Lemma 4. First consider $\beta < 1/2$. Since

$rM = (\Delta^2 M/2\log t(\alpha))(2r\log t(\alpha))/\Delta^2 \sim (\Delta^2 M/2\log t(\alpha))\,r^{2(\beta_0 - \beta)}$,

it suffices to prove that $\Delta^2 M/\log t(\alpha) \xrightarrow{P} 2$. Recalling that $\theta_1 < \theta_2$ and defining $T_n = \bar X_{2n} - \bar X_{1n} - \Delta$, equation (3.2) shows that with probability approaching one, $M = M_1$, so that with probability approaching one,

(3.3)  $T_M + \Delta > h(t(\alpha),M)\,S(1,2,M)$,
       $T_{M-1} + \Delta \le h(t(\alpha),M-1)\,S(1,2,M-1)$.

Using the facts that $\beta < 1/2$ and $M \ge ar^{-1}$, the law of the iterated logarithm shows that $T_M/\Delta \xrightarrow{P} 0$, $T_{M-1}/\Delta \xrightarrow{P} 0$. Dividing through by $\Delta$ in (3.3) and noting that $S^2(1,2,n) \to 2$ almost surely, a few manipulations show that

(3.4)  $(\log t(\alpha) + \log M)/\Delta^2 M \xrightarrow{P} 1/2$.

Now, since $M \ge ar^{-1}$, $\Delta^2 M \ge M^{\gamma}$ for some small positive $\gamma$, so that (3.4) becomes $(\log t(\alpha))/\Delta^2 M \xrightarrow{P} 1/2$. Next we consider the case $\beta \ge 1/2$. In (3.3), divide through by $r^{1/2 - \epsilon}$, where $\epsilon > 0$ is sufficiently small. Since $\Delta/r^{1/2 - \epsilon} \to 0$, some manipulations yield

$(\log t(\alpha) + \log M)/Mr^{1 - 2\epsilon} \xrightarrow{P} 0$.

Since $Mr^{1 - 2\epsilon} \ge M^{\gamma}$ for some $\gamma > 0$, this gives

(3.5)  $Mr^{1 - 2\epsilon}/\log t(\alpha) \xrightarrow{P} \infty$.

Since $rM = r^{2\epsilon}\log t(\alpha)\,(r^{1 - 2\epsilon}M/\log t(\alpha))$, we can choose $\epsilon > 0$ sufficiently small so that $r^{2\epsilon}\log t(\alpha) \to \infty$ and hence by (3.5), $rM \xrightarrow{P} \infty$, which completes the proof.


4. Mean Square Error

In this section we consider the MSE $r^{-1}E(\theta^*_N - \theta_2)^2$ both asymptotically and in a Monte-Carlo study for small sample sizes. Suppose that upon stopping, $N_i$ observations have been taken from the ith population (i = 1,2). Recall that from previous considerations, we are taking $\theta_1 < \theta_2$, $N_i \ge ar^{-1}$, and $\sigma = 1$. Blumenthal and Cohen (1968) indicate that even for $N_B$, there are many ways of estimating $\theta_{[2]}$, but that $\theta^*_n = \max(\bar X_{1n}, \bar X_{2n})$ is a reasonably effective choice. Our stopping time employs an elimination feature, so we must take into account the possibility that an eliminated population has a sample mean (upon stopping) larger than any other sample mean (upon stopping). The estimate we choose then is given by $\theta^*_N = \max(\bar X_{1N_1}, \bar X_{2N_2})$. An alternative estimator is the maximum sample mean over all populations which have not been eliminated, but we have been unable to verify the uniform integrability needed in the proofs to follow. The cases $\Delta \sim r^{\beta}$ ($\beta < 1/2$) and $\Delta \sim r^{\beta}$ ($\beta \ge 1/2$) are different and are treated separately.
Lemma 6. (Asymptotics of $\theta^*_N$ when elimination may occur). Consider the conditions of Lemma 4 with $0 < \beta_0 < 1/2$ and $0 \le \beta < 1/2$. Then the limit distribution of $r^{-1/2}(\theta^*_N - \theta_2)$ is the standard normal and $r^{-1}E(\theta^*_N - \theta_2)^2 \to 1$.

Lemma 6 says that for k=2 if the population means are sufficiently separated, even if elimination does not occur, our general procedure has precisely the same asymptotic behavior (in $\theta^*$) as does $N_B$. The next result shows this to be true when the means are not separated, i.e., $\theta_2 - \theta_1 = \Delta \sim r^{\beta}$, $\beta \ge 1/2$. Note in this case that Lemma 4 says that elimination will probably not happen.

Lemma 7. (Asymptotics of $\theta^*_N$ when elimination is unlikely to occur). Consider the conditions of Lemma 4 with $\beta \ge 1/2$. Let $\xi$ be the limit distribution in the conclusion to Lemma 3. Then $r^{-1/2}(\theta^*_N - \theta_2) \Rightarrow \xi$, $E\xi^2$ exists, and

$r^{-1}E(\theta^*_N - \theta_2)^2 \to E\xi^2$.

The same results hold if $N_B$ is used without elimination.

The results of Lemma 6 and Lemma 7 are rather unusual, in that they say that for k=2 the simple elimination idea employed here can save the user in terms of sample size with no (asymptotic) change in MSE. In order to see how this works with small samples, we ran a Monte-Carlo experiment with 500 simulations. The complete results are reported in Carroll (1978b), but here we consider $\alpha$ = .01, r = .10, .01 and $\Delta$ = 2.00, 1.00 and .20. An initial sample of size m=5 was chosen as suggested in Section 3. The results are given below, with $T_E/T_B$ being the ratio of sample size needed for the elimination procedure relative to Blumenthal's procedure.
                                        Δ = 2.00   Δ = 1.00   Δ = .20

T_E/T_B                      r = .10      1.00       1.00      1.00
                             r = .01       .82        .64      1.00

r^{-1} MSE for elimination   r = .10       .86        .89       .72
                             r = .01       .93        .90       .63

r^{-1} MSE for Blumenthal    r = .10       .86        .89       .73
                             r = .01       .93        .77       .63

Apparently, both procedures achieve their goal of controlling MSE. The elimination stopping time can lead to substantial savings in sample size while achieving its bound on MSE. The Blumenthal procedure appears to have slightly lower MSE overall, but this is achieved at the cost of increased sample size.


Proof of Lemma 6. By Lemmas 1 and 4, $rN_i$ converges in probability to a constant (either $a_0$ or 1 depending on $\beta_0$, $\beta$). Thus, by Anscombe (1952, Theorem 1) the vector

(4.1)  $\big(r^{-1/2}(\bar X_{1N_1} - \theta_1),\ r^{-1/2}(\bar X_{2N_2} - \theta_2)\big)$

converges in distribution to a normal random vector. This gives $\Pr\{\bar X_{2N_2} > \bar X_{1N_1}\} \to 1$ since $r^{-1/2}(\theta_2 - \theta_1) \to \infty$. Hence

$\Pr\{r^{-1/2}(\theta^*_N - \theta_2) \le z\} = \Pr\{r^{-1/2}(\bar X_{2N_2} - \theta_2) \le z\} + o(1)$,

so that $r^{-1/2}(\theta^*_N - \theta_2)$ has the required limit distribution. By Bickel and Yahav (1968), to complete the proof it suffices to show that for some $r_0 > 0$,

(4.2)  $\sum_{m=1}^{\infty}\sup_{0 < r < r_0}\Pr\{r^{-1}(\theta^*_N - \theta_2)^2 > m\} < \infty$.

Since $N_i \ge ar^{-1}$ (i = 1,2), our definition of $\theta^*_N$ shows that

$\Pr\{r^{-1}(\theta^*_N - \theta_2)^2 > m\}$
$\le \Pr\{|\bar X_{1n} - \theta_1| > (mr)^{1/2} \text{ for some } n \ge ar^{-1}\}$
$\ + \Pr\{|\bar X_{2n} - \theta_2| > (mr)^{1/2} \text{ for some } n \ge ar^{-1}\}$
$\le c_0 m^{-2}$ (for some $c_0 > 0$),

this last following by the maximal inequality for reverse martingales (Doob (1953)). This verifies (4.2). □

Proof of Lemma 7. Under our conditions, $N_i/N_B \xrightarrow{P} 1$ (i = 1,2). Thus, by Lemma 3, $r^{-1/2}(\theta^*_N - \theta_2) \Rightarrow \xi$. The rest of the proof now follows from Bickel and Yahav (1968) and (4.2).

5. The Case of More Than Two Populations

The results of the previous sections can be generalized for $k \ge 3$ by basic notational changes. Lemma 5 already discusses the case $\Delta_i \sim r^{\beta_i}$, with $0 < \beta_i < 1/2$ for i = 1,...,p and $\beta_i \ge 1/2$ for i = p+1,...,k-1. Lemma 6 does not change if p = k-1, so that all populations but one may be eliminated. Lemmas 1-3 and 7 can handle the mixture situation p < k-1 with some simple changes in definition. If one makes the further reasonable assumption that $H_k(x_1,\ldots,x_{k-1}) = H_{k-p}(x_{p+1},\ldots,x_{k-1})$ when $x_1 = \cdots = x_p = \infty$, then, as in Section 4, it is possible to show that $N_B$ and the elimination procedure lead to the same asymptotic MSE.